The COVID-19 Sentinel Schools Network of Catalonia (CSSNC) project: Associated factors to prevalence and incidence of SARS-CoV-2 infection in educational settings during the 2020–2021 academic year

The Sentinel Schools project was designed to monitor and evaluate the epidemiology of COVID-19 in Catalonia, gathering evidence to inform health and education policies and the development of health protocols and public health interventions to control SARS-CoV-2 infection in schools. The aim of this study was to estimate the prevalence and incidence of SARS-CoV-2 infections and to identify their determinants among students and staff from February to June of the 2020–2021 academic year. We performed two complementary studies, one cross-sectional and one longitudinal, using a questionnaire to collect nominal data and testing for SARS-CoV-2 detection. We describe the results and report univariate and multivariate analyses. The initial crude seroprevalence was 14.8% (95% CI: 13.1–16.5) and 22% (95% CI: 18.3–25.8) for students and staff respectively, and the active infection prevalence was 0.7% (95% CI: 0.3–1) and 1.1% (95% CI: 0.1–2). The overall incidence for persons at risk was 2.73 per 100 person-months, and 2.89 and 2.34 per 100 person-months for students and staff, respectively. Socioeconomic, self-reported knowledge, risk perception and contact pattern variables were positively associated with the outcome, while compliance with sanitary measures was negatively associated; the same significance and trend were observed in the multivariate analysis. In the longitudinal component, epidemiological close contact with SARS-CoV-2 infection was a risk factor for SARS-CoV-2 infection, while the highest socioeconomic status level was protective, as was compliance with sanitary measures. The small number of active cases detected in these schools suggests low transmission among children in school and the efficacy of the public health measures implemented, at least in the epidemiological scenario of the study period.
The major contribution of this study was to provide results and evidence that help analyze the transmission dynamics of SARS-CoV-2 and evaluate the associations between the sanitary protocols implemented and the measures to avoid SARS-CoV-2 spread in schools.


Introduction
The coronavirus disease 2019 (COVID-19) outbreak began in Wuhan, China in December 2019 and rapidly became an international public health emergency. As of March 2022, there had been more than 470 million cases and 6 million deaths globally [1]. The first case of COVID-19 in Spain was confirmed on January 31, 2020, and in Catalonia on February 25 [2]. By March 2022, the Catalan region had registered more than 2 million cumulative cases and more than 26,000 deaths [3].
At the beginning of the pandemic, in March 2020, it was estimated that 107 countries and 862 million children and young people were affected by the closure of schools, one of the public health measures aiming to reduce the transmission of SARS-CoV-2 [4]. This number increased to 1.57 billion students worldwide over the following months [5]. Many governments chose to close schools in response to the pandemic because school closure had previously been shown to be an effective non-pharmacological measure to control the spread of other viruses such as influenza [4,6], to which children contribute significantly [7][8][9]. Nevertheless, at the beginning of the current pandemic, data on the prevalence of COVID-19 in children were scarce due to low testing of the pediatric population [10] and the fact that parameters and evidence about COVID-19 occurrence in adults could not be extrapolated to children [11]. A great deal of effort was made to resolve this question.
Since the beginning of the pandemic, the contribution of children to the spread of the virus has been debated [12]. People aged 0 to 14 had a lower risk of SARS-CoV-2 infection than those aged 15 to 64; additionally, among the infected, older people had more severe outcomes and higher mortality rates [4].
Estimates based on household secondary attack rates may be influenced by several factors, such as contact patterns, increased exposure, and symptomatic surveillance, which depends on the sensitivity of case detection, and several serological studies estimate the highest prevalence among adults under 35 years [13,14]. Even so, the available evidence indicates that children are not the main source of spread of the SARS-CoV-2 virus, and therefore interventions targeting this population can have a lower impact than expected [15,16].
School closure could also cause social, economic and health problems, with emotional costs for children and young people, considering the interruption of other school-based activities such as nutrition, mental health, safety, and social assistance services [6,13,14,17–19].
In Catalonia (7,739,758 inhabitants), the closure of the 5,492 schools on 13 March 2020 affected 1,582,478 students and 116,999 teaching staff. The schools reopened on 14 September 2020, immediately after the school vacation between June and August, and therefore remained closed for six months [20].
The current school guidelines were developed by the Government of Catalonia based on SARS-CoV-2 indicators. They include early detection and isolation, as well as the implementation of public health measures such as natural ventilation of classrooms, stable coexistence groups (SCG or bubble groups) and targeted screening [21,22]. They also include monitoring of COVID-19 risk factors, determinants, transmission dynamics, compliance with preventive measures and outbreaks in schools, to provide evidence to improve school safety and prevent further closures and their impact [4,18,23].
The COVID-19 Sentinel Schools Network of Catalonia (CSSNC) is part of the COVID-19 monitoring and evaluation plan of the Health Department of Catalonia. The main objective is to monitor and evaluate the epidemiological situation of COVID-19 and its determinants in the educational setting, to gather evidence for health policies aimed at the prevention and control of SARS-CoV-2 infection in schools, and to serve as a platform for other applied research projects. Currently, in 2022, the CSSNC includes 23 schools with 4,221 children and 1,140 staff from all over Catalonia. The study protocol has been previously published [24].
The aims of the CSSNC project (www.escolessentinella.org) include: the monitoring of biological markers; knowledge, attitudes and behaviors towards SARS-CoV-2 preventive measures; the identification of both facilitators and barriers to their implementation; and the monitoring of environmental indicators such as CO2, all together using a participatory research approach [24].
In this paper, to answer the question about the occurrence of COVID-19 among the school-aged population, the main objectives are to estimate the prevalence and incidence of SARS-CoV-2 exposure and infection and to identify potential factors associated with them among students and staff of the CSSNC during the 2020–2021 academic year. As a secondary objective, the feasibility of a twice-monthly testing strategy is also assessed.

Study design and population
This study included 2,007 students and 520 school staff (teaching and non-teaching staff, such as extracurricular education instructors and administrative personnel) from seven schools across Catalonia, all of whom had previously signed the informed consent. Although this is an opportunistic sample, the epidemiological and sociodemographic characteristics of the area, as well as the type of school (public, private or chartered), were considered to ensure heterogeneity. During the study period, we used two methodological approaches: a cross-sectional study to estimate SARS-CoV-2 prevalence, and a longitudinal study to calculate the COVID-19 incidence and evaluate the feasibility of the twice-monthly testing strategy.
The longitudinal component included a cohort of 1,424 participants: 983 students over 12 years old and 441 school staff.
indicators were included according to the COSMO questionnaire [25]. In each longitudinal round, participants filled in an additional online epidemiological survey with information on SARS-CoV-2 infection, suspected symptoms, exposures, and vaccine status during the previous 15 days.
Three different questionnaire models were designed: one for school staff (questionnaire A); one for students under 16 years, which was answered by parents or guardians (questionnaire B); and one for students over 16 years (questionnaire C).
Secondary data on vaccine coverage and socioeconomic level were provided by the Agència de Qualitat i Avaluació Sanitàries de Catalunya (AQuAS) through the Primary Care Services Information System (SISAP) and the Data Analytics Program for Health Research and Innovation (PADRIS), which collect programmatic data from different sources. The socioeconomic level variable was based on the sanitary regions and was used to categorize the place of residence into tertiles (high, medium and low).
Biological samples were collected from all participants. A finger-prick blood sample was collected to perform a rapid serological anti-SARS-CoV-2 IgM/IgG test to estimate the initial and final seroprevalence, in February and June 2021 respectively. Saliva and nasal swab samples were collected twice a month to investigate the presence of SARS-CoV-2 RNA and SARS-CoV-2 antigens.
All results were uploaded to the electronic health record and became available to the participants, normally within 48 hours of sample collection.

Independent variables
Factors that could have an impact on the outcome were treated as independent variables and tested for association with SARS-CoV-2 infection. They were categorized as sociodemographic, health status, contact patterns, knowledge and perceptions, and preventive measures. Each variable was coded according to the type of question asked in the questionnaire (Table 1).
Samples in which SARS-CoV-2 was detected were stored in sample collection C.0001145, located at the Vall d'Hebron Hospital Universitari (Barcelona, Spain) and registered at the Instituto de Salud Carlos III registry, Madrid, Spain. Saliva samples with positive SARS-CoV-2 results were frozen and stored at the IGTP-HUGTiP Biobank, Badalona, Catalonia, Spain, and maintained for two years.

Outcomes and case definitions
Our first outcome was previous exposure to the SARS-CoV-2 virus. A positive case was defined as any individual with SARS-CoV-2 IgG antibodies detected by rapid test. The second outcome was active SARS-CoV-2 infection. A positive case was defined as any individual, symptomatic or asymptomatic, with a positive RT-PCR or rapid antigen test (RAT) detected by the project team, or detected and confirmed by RT-PCR or RAT performed in primary health care, during the follow-up period. We decided to include these self-reported documented infections because students and school staff with positive results started the isolation protocol and were no longer tested at school.

Data analysis
We calculated crude and adjusted prevalence for students of 2-20 years of age on the census in Catalan schools, adjusting for age and sex. For school staff, we adjusted only for sex, and then for the sensitivity and specificity of the tests. The differences between initial and final seroprevalence were calculated only for students from the first grade of middle school or older and for staff, using the McNemar test.
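The text does not state which formula was used to adjust prevalence for test performance. One common choice is the Rogan-Gladen estimator, sketched below as an illustration only; the function name and the sensitivity and specificity values are hypothetical and are not taken from the study.

```python
def rogan_gladen(apparent_prev: float, sensitivity: float, specificity: float) -> float:
    """Adjust an apparent (test-positive) prevalence for imperfect test accuracy.

    All arguments are proportions in [0, 1]. This is the Rogan-Gladen
    estimator, assumed here for illustration; the study does not name
    the method it used.
    """
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    adjusted = (apparent_prev + specificity - 1.0) / denom
    # Truncate to the valid range, since the estimator can fall outside [0, 1].
    return min(1.0, max(0.0, adjusted))


# Hypothetical test characteristics, NOT the values of the rapid test in the study.
example = rogan_gladen(apparent_prev=0.148, sensitivity=0.90, specificity=0.98)
print(f"adjusted prevalence: {example:.3f}")
```

With a perfect test (sensitivity and specificity both equal to 1) the estimator returns the crude prevalence unchanged, which is a quick sanity check on any implementation.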
Descriptive analysis was performed, and the data were provided globally and stratified by educational stage when possible and presented considering sociodemographic and socioeconomic indicators; contact pattern; knowledge, behavior and perceptions of COVID-19 and health status. Frequency, measures of central tendency (mean and median) and dispersion (standard deviation and IQR) were calculated.
Univariate analysis was performed to investigate the association between the independent variables and the outcome; variables with a p-value <0.050 were considered statistically significant. Prevalence ratios (PR) with 95% confidence intervals (CI) were calculated using a Poisson model with robust errors, adjusted for age, sex and school. The combined qualitative variables were compared using the McNemar test.
For the multivariate analysis, an initial correlation graph (polychoric correlation) was constructed for each category of variables; of the pairs with a correlation coefficient greater than 0.8 (in absolute value), only one of the variables was used in the analysis.
To fit a multivariate model, we performed the analysis with stepforward and stepwise regression, i.e., starting with the model containing all variables and removing them one by one, and subsequently starting with an empty model and adding the variables one by one. We used the same type of regression as for the univariate models (Poisson GLM with robust errors). We selected variables by significance, R2 and AIC, and both procedures (stepforward and stepwise regression) gave the same model, with R2 0.184547 and AIC 1261.318, compatible with the set of behavioral variables included in the model. We checked the goodness of fit of the multivariate model and found no overdispersion in the Poisson model.
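As an illustration of the paired comparison mentioned above, the following sketch computes an exact two-sided McNemar p-value from the two discordant counts of a paired 2x2 table (for example, negative at baseline and positive at the end, and the reverse). Whether the exact or the chi-square version of the test was used is not stated in the text, and the counts shown are hypothetical.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Exact two-sided McNemar p-value.

    b and c are the discordant pair counts of a paired 2x2 table.
    Under the null hypothesis each discordant pair falls in either
    cell with probability 0.5, so min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of change
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


# Hypothetical discordant counts, not study data.
print(mcnemar_exact(b=2, c=15))
```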
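The filtering step described above can be sketched as follows, assuming the pairwise correlations have already been computed elsewhere. The text does not say which variable of a highly correlated pair was retained, so this sketch simply keeps the one listed first; the variable names and correlation values are hypothetical.

```python
from typing import Dict, List, Tuple


def drop_highly_correlated(
    variables: List[str],
    corr: Dict[Tuple[str, str], float],
    threshold: float = 0.8,
) -> List[str]:
    """Keep a variable only if |correlation| with every kept variable is <= threshold.

    `corr` maps unordered variable pairs to a precomputed coefficient
    (e.g. polychoric); missing pairs are treated as uncorrelated.
    """
    def lookup(a: str, b: str) -> float:
        return corr.get((a, b), corr.get((b, a), 0.0))

    kept: List[str] = []
    for var in variables:
        if all(abs(lookup(var, other)) <= threshold for other in kept):
            kept.append(var)
    return kept


# Hypothetical variables and coefficients, for illustration only.
names = ["mask_use", "hand_washing", "avoid_crowds"]
pairs = {
    ("mask_use", "hand_washing"): 0.85,
    ("mask_use", "avoid_crowds"): 0.20,
    ("hand_washing", "avoid_crowds"): -0.90,
}
print(drop_highly_correlated(names, pairs))
```

Keeping the first-listed variable is an arbitrary rule; in practice the choice within a pair is usually made on substantive grounds.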
To calculate the incidence rate for the at-risk population, we divided the number of participants with a positive RT-PCR or RAT result by the person-time at risk. To define the population at risk of infection, we excluded individuals with a positive RT-PCR or RAT result in the 60 days before the start of the cohort. The denominator was defined as the sum of the time at risk of the 1,366 participants sampled during the cohort (950 students and 416 employees). Time at risk was defined, for each participant, as the time between the moment they entered the study and the endpoint at which they tested positive or, if they did not obtain any positive result, the last round in which they were tested. The result was presented per 100 person-months.
In the longitudinal component, a univariate analysis was performed including the same variables that had been tested in the cross-sectional component, adding the information collected during the follow-up endpoints. These data were assessed using independent log-binomial mixed models to calculate the relative risk (RR), with the participant's identifier as a random effect and adjusting for age, sex and school. Due to the low number of positives, a multivariate model for the longitudinal component was not performed.
For the univariate analysis, a log-binomial GLMM with ID as a random intercept was used to estimate the PR and RR of each variable of interest, providing an adjusted measure and controlling for confounding by age, sex and school.
Two composite indicators were created to measure the participants' knowledge of COVID-19, "perceived knowledge" and "factual knowledge", and another indicator to measure "risk perception". "Perceived knowledge" and "risk perception" were measured using a 7-point Likert scale; scores of 1-4 were considered a low level of knowledge or risk perception and scores of 5-7 a high level. "Factual knowledge" was measured by a binary score composed of three aspects: people at risk, symptoms and means of transmission. The answers were counted, and we classified factual knowledge as high when more than 50% of answers were correct and as low when 50% or fewer were correct.
All analyses were carried out with R (version 4.1.0). Confidence intervals for incidence were obtained using the 'epi.conf' function from the 'epiR' package. The number of samples that should be tested to find a positive was calculated using Ene 3.0.
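The incidence calculation described above can be sketched as follows. Each participant contributes the time between study entry and either the first positive test or the last tested round; the follow-up records below are hypothetical toy data, not the study cohort, and the helper names are illustrative.

```python
from typing import Iterable, Tuple


def time_at_risk(entry_month: float, exit_month: float) -> float:
    """Months between study entry and the first positive test or last tested round."""
    if exit_month < entry_month:
        raise ValueError("exit must not precede entry")
    return exit_month - entry_month


def incidence_per_100_person_months(follow_up: Iterable[Tuple[float, bool]]) -> float:
    """Incidence rate from (months_at_risk, tested_positive) records."""
    records = list(follow_up)
    person_months = sum(months for months, _ in records)
    cases = sum(1 for _, positive in records if positive)
    if person_months <= 0:
        raise ValueError("no person-time at risk")
    return 100.0 * cases / person_months


# Hypothetical cohort: (entry, exit, positive) expressed in months since cohort start.
toy = [(0, 4, False), (0, 2, True), (1, 4, False), (0, 1, True)]
records = [(time_at_risk(start, end), pos) for start, end, pos in toy]
print(incidence_per_100_person_months(records))
```

The sketch returns only the point estimate; confidence intervals in the study were obtained separately with the R function named in the text.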
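The classification rules for the composite indicators can be sketched as below. The cut-offs (1-4 low, 5-7 high; more than 50% correct is high) come from the text, while the function names and the example answers are hypothetical.

```python
from typing import Sequence


def likert_level(score: int) -> str:
    """Classify a 7-point Likert score: 1-4 is 'low', 5-7 is 'high'."""
    if not 1 <= score <= 7:
        raise ValueError("Likert score must be between 1 and 7")
    return "high" if score >= 5 else "low"


def factual_knowledge_level(answers: Sequence[bool]) -> str:
    """Classify factual knowledge from correct/incorrect answers.

    'high' when strictly more than 50% of answers are correct,
    'low' when 50% or fewer are correct.
    """
    if not answers:
        raise ValueError("at least one answer is required")
    share_correct = sum(answers) / len(answers)
    return "high" if share_correct > 0.5 else "low"


# Hypothetical participant: perceived knowledge score of 6, two of three items correct.
print(likert_level(6), factual_knowledge_level([True, True, False]))
```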

Ethics statement
The Foundation University Institute for Research in Primary Health Care Jordi Gol i Gurina (IDIAPJGol) approved the study on 17 December 2020 (code 20/192-PCV). Informed consent was obtained from school staff, from parents of children under 16, and from students aged 16 or older. Participants were free to decline or withdraw consent at any time without providing a reason and without being subject to any resulting detriment.

Results
Among participating students, most were female (55% overall), except in preschool. Regarding socioeconomic variables, 821 (41.7%) of the students' fathers and 1,043 (52.3%) of the students' mothers had high levels of study or university qualifications, and 1,705 (86.7%) of the students' fathers and 1,613 (81.0%) of the students' mothers were employed during the study period (Table 2).
Regarding epidemiological data on SARS-CoV-2 exposure risk, for students the most common place of contact with a suspected or confirmed case was the school, with 65 (65.0%) in the preschool group. For school staff, 122 (63.2%) reported having contact with a suspected or confirmed case at the school (Table 4).

Seroprevalence, univariate and multivariate analysis in the cross-sectional component
The baseline seroprevalence of SARS-CoV-2 IgG for students and school staff was, respectively, 14 Table 5). Among participants who had serological tests both at baseline and at the end of the longitudinal component (round four), there was a significant increase in prevalence (p<0.001). The main differences were in the staff group (p-value <0.001), although there was also a non-significant increase among students in vocational studies (Fig 2).
The variables included in the univariate analysis for students and staff were grouped into sociodemographic, health and behavior, and contact pattern categories. Among parents and school staff, indiscriminate changes in the employment situation (PR 1.43, CI 1.07-1.91) and an improved economic situation (PR 2.66, CI 1.18-6.00) were positively associated with having been infected. Higher perceived knowledge was positively associated with infection (PR 1.68, CI 1.05-2.68). The public health measure of avoiding crowded spaces was negatively associated with infection (PR 0.65, CI 0.45-0.93), and a higher risk perception was positively associated (PR 1.49, CI 1.14-1.93) (Table 6).

PLOS ONE
In the case of contact patterns, having non-specific contact with a suspected or confirmed COVID-19 case (PR 2.76; CI 1.94-3.93) or contact at home (PR 2.17; CI 1.62-2.91) was positively associated with infection; in contrast, contact at school had a strongly negative association (PR 0.60; CI 0.45-0.80). Living with a health professional was not associated with infection (p-value = 0.262) (Table 7).
In the multivariate analysis, the same significance and trend among socioeconomic, health measures and contact patterns variables were observed (Table 8).

Incidence and univariate analysis in the longitudinal component
During the longitudinal component of the study, 45 new infections occurred (34 students and 11 staff), 11 of them identified by RT-PCR performed by the project team and 34 self-reported by the participants. It is interesting to note that of the 11 RT-PCR positives identified in the study, only 1 (9%) was also detected by RAT.
The overall incidence was 2.73 (95% CI 1.991, 3.653) per 100 person-months: 2.887 (95% CI 1.999, 4.034) and 2.337 (95% CI 1.167, 4.182) per 100 person-months for students and staff, respectively. The variables included in the univariate analysis were also categorized into sociodemographic and socioeconomic indicators, health status, and preventive compliance. Socioeconomic status showed a protective association with COVID-19 infection when comparing the highest tertile (high) with the first tertile (low) as reference (RR 0.25, 95% CI 0.06-0.96) (Table 9).

Table 6. Summary and univariate analysis between SARS-CoV-2 infection and sociodemographic, health and behavioral indicators of students and staff from senti
Contact with a suspected or confirmed case of COVID-19 was a risk factor for infection (RR 6.44, 95% CI 3.15-13.19). When the contact occurred at home, the risk (RR 12.42, 95% CI
We tested several sanitary measures that had been carried out in the seven days before the survey; only avoiding close contact with someone who is infected or at risk (RR 0.38, 95% CI 0.15-0.97) and wearing a mask (RR 0.14, 95% CI 0.04-0.53) were associated with the outcome. This is compatible with the also significant result of the variable contact with a suspected or confirmed cases of
We tested the feasibility of a twice-monthly RT-PCR testing strategy. Considering a prevalence of 0.07% and an accuracy of 0.05, 1,258 participants would need to be tested to find one positive.

Discussion
In Switzerland, the "Ciao Corona" study, conducted in June/July 2020, October/November 2020, and March/April 2021 with 2,585 children, found a SARS-CoV-2 IgG, IgM and IgA seroprevalence of 2.8% (95% CI 1.6-4.1%) [26,27]. In Germany, a study conducted during May and June 2020 found a seroprevalence of 0.6% for students and school staff, and 0.7% at follow-up in September/October 2020 [28]. A population study of children under 18 years of age in Catalonia found a seroprevalence of 4.4% between March and April 2020, lower than in our study. This difference may reflect the difficulty of diagnosing asymptomatic young people, especially during the initial period of the pandemic, emphasizing the importance of active surveillance of school sentinel populations for the timely detection of respiratory viruses [29].
As expected, in our study there was a significant increase in SARS-CoV-2 seroprevalence in the school staff group, which can be explained by the increase in vaccination coverage. According to PADRIS data, vaccine coverage among school staff went from 78% partly vaccinated and 0.6% fully vaccinated in April 2021 to 84.3% and 35.6%, respectively, in June 2021. At the time of seroprevalence data collection in this study, vaccines were not approved for people under 18 years.
The prevalence of active SARS-CoV-2 infections detected by the project was low considering the overall prevalence and incidence in Catalonia during the same period [3,20,30,31]. This suggests that public health strategies such as testing of symptomatic individuals and contact tracing efforts were effective at identifying active infections at school, even in the asymptomatic population [28]. Another study, conducted in December 2020 during a period of high community transmission in Switzerland [17], found no PCR-positive teachers and one PCR-positive child, and positive antigen tests in 7 (1.1%) children and 2 (3.0%) teachers.
Considering detected and self-reported infections in our longitudinal study, we found a low incidence of COVID-19 infections, consistent with other studies reporting very similar results [28,32–34]. There are also studies suggesting that higher community incidence, diagnostic issues [30], and demographic and economic aspects are determinants of the variation in the rates detected [16,35,36].
The association between socioeconomic status and SARS-CoV-2 infection differed depending on the period of data collection. First, at the beginning of the pandemic, an improved economic situation was positively associated with having been infected. This could be explained by the fact that the most affected population were those who worked and travelled rather than those who were respecting the lockdown measures. Then, during the follow-up, we observed a new trend in which a higher infection risk was associated with lower economic status. This provides important clues to understanding the COVID-19 burden in different economic and demographic contexts [37]. Population-based studies found similar results, with heterogeneity in incidence and mortality rates [32,35,36] associated with socioeconomic status, showing the importance of planning sanitary policies oriented to territorial characteristics and specific inequities [38,39]; for example, a follow-up study in Brazil found a high incidence in children living in a slum area [16].
At baseline, contact with suspected or confirmed cases, especially at home, was positively associated with SARS-CoV-2 infection, as observed in a study that found that physical distancing measures, including limited close contacts while schools remained open, controlled SARS-CoV-2 transmission [40]. However, school contacts had a negative association with this outcome, showing how well-implemented sanitary protocols make the safe opening of schools possible, consistent with other studies that found an association between low transmission and sanitary recommendations and preventive measures [6,41].
In the longitudinal component analysis, all contact patterns were risk factors, especially when contact was at home, consistent with previous studies that demonstrated an increased risk of infection associated with household contacts in Catalonia [15,31,42] and modeling studies that demonstrated an increased risk of infection in household contacts [43,44]. Moreover, contact tracing studies found no typical or frequent child-adult transmission [16,45,46] and a low contribution of children to secondary cases [47,48], which suggests that children are not the main source of infection [11,49].
Interestingly, contact with healthcare professionals was not associated with infection in our study. Our hypothesis is that health care workers (HCW) comply with preventive measures at home when they have been exposed to risk situations; however, more studies are needed to understand the role of HCW in this transmission model.
Perceived knowledge was positively associated with infection, which may indicate either knowledge acquired due to a previous infection, or the large amount of lay knowledge consumed by the young and indeed the general population [49,50].
Knowledge of COVID-19 and risk perception may have been due to the occurrence of a previous infection, which would not explain the occurrence of a later infection. A high level of risk perception of exposure might indicate they understood the risks they had taken. Other studies also show that risk perceptions, behaviors and compliance with sanitary measures are associated with levels of knowledge [51][52][53].
As with other studies [15,17,47] our results reinforce that the transmission by children in the school setting did not appear to make a major contribution to the spread of the virus, especially for the youngest children. This supports the decision of many countries to keep schools open while following several public health measures and safety protocols to control the transmission of the virus. Our study also reinforces the idea that the strategy based on an active sentinel surveillance for detection of acute SARS-CoV-2 infections followed by isolation of bubble groups seems to be more effective in scenarios with susceptible groups and rapid transmission.
Approximately half of the target population agreed to participate; considering the difficult circumstances schools and families were experiencing because of the pandemic, we consider this proportion to be quite acceptable. It is similar to other studies, in which 75% of students and 25% of staff participated [28] or child participation was 49% [17].

Limitations
Although the overall participation rate in our study was 45.4%, it was proportionately higher among school staff (72%) than students (41%). Given that sociodemographic heterogeneity (nationality, language, socioeconomic status) was higher among students than staff, some of these factors could also have influenced participation.
There were no difficulties in implementing the study and all sentinel schools gave us excellent feedback for the associated activities.
Because of the sample of schools and the participation rate, the study population may not be representative of all schools in Catalonia. Nevertheless, the heterogeneity of the included schools provides information from different socioeconomic scenarios with a sufficiently large study population. As a sentinel population approach, the objective of the CSSNC is not to extrapolate parameters, but to complement formal epidemiological surveillance systems by steadily monitoring the studied population over time. The data presented were gathered before the circulation of the Omicron variant, and therefore our findings may not apply to the Spanish sixth wave or to other future scenarios related to potential new variants and vaccine recommendations for children.
Given the cross-sectional design, associations should be interpreted with caution, without attributing causality. Variables such as distal characteristics must be interpreted differently from variables that can change over time, such as knowledge, behavior, and contact patterns, which are influenced by the occurrence of the disease. Moreover, acceptability, compliance, and prevention behaviors may have been directly affected by the course of the pandemic.
There were some limitations to our longitudinal analysis, as the small number of acute infections made it impossible to apply a multivariate analysis. Also, with community public health measures occurring simultaneously with the schools' own protocols, it was difficult to evaluate these determinants separately. In addition, there was a poor distribution of confounders between groups, which can also have very different sizes, resulting in a loss of statistical power in a multivariate model.

Conclusions
This study offers a unique perspective on the prevalence and incidence of SARS-CoV-2 infection among students and school staff in Catalonia, an important result considering the difficulty of detecting the virus among asymptomatic young people, as well as on the compliance with and effectiveness of the public health measures implemented in these schools against the transmission of SARS-CoV-2.
The CSSNC demonstrated, for the first time in Spain, the feasibility of correlating individual socio-epidemiological data and data on the prevalence and incidence of SARS-CoV-2 in the school environment, even during the difficult acute period of the pandemic. Despite the high prevalence and community incidence of SARS-CoV-2 in Catalonia during the study period, this project found a low prevalence and incidence of active infections in the school population, suggesting that the prevention methods adopted by schools, together with other strategies of health care, such as testing and contact tracing, were effective in containing transmission in educational settings.
Monitoring of SARS-CoV-2 biological markers and their behavioral and structural determinants over time in sentinel schools is crucial to assess the situation of the COVID-19 pandemic and provide relevant information to inform guidelines and policies to increase safety among students and staff in school environments. Apart from identifying multilevel transmission determinants for SARS-CoV-2 among students and school staff, they may also be useful to describe the spread of other infectious diseases such as influenza and other respiratory viruses and facilitate healthier learning environments for all.